No need for conspiracy: Self-organized cartel formation in a modified trust game
We investigate the dynamics of a trust game on a mixed population where individuals with the role
of buyers are forced to play against a predetermined number of sellers, whom they choose dynamically.
Agents with the role of sellers are also allowed to adapt the level of value for money of their products,
based on payoff. The dynamics undergoes a transition at a specific value of the strategy update rate,
above which an emergent cartel organization is observed, where sellers have similar, below-optimal
values of value for money. This cartel organization is not due to explicit collusion among agents;
instead it arises spontaneously from the maximization of the individual payoffs. This dynamics is
marked by large fluctuations and a high degree of unpredictability for most of the parameter space,
and serves as a plausible qualitative explanation for observed elevated levels and fluctuations of
certain commodity prices.



Modern societies are complex systems, where observed 
macroscopic properties are the emergent result of collec- 
tive actions of individual agents. The most commonly 
adopted scenario assumes that agents select strategies 
which are perceived to be in their own best interest. This 
decision, however, must often be made without full a pri- 
ori knowledge of the most likely outcome, and thus must 
rely on some notion of belief, or trust. Common examples 
include most types of markets, where buyers must decide 
if a certain product is worth its cost, and sellers must de- 
cide which price they should assign for their products. At 
the most fundamental level, this problem can be framed 
as a trust game [1-3], where a buyer must decide whether
he buys a product at a given cost, and the seller decides 
which cost to select. If the price is perceived to be fair by 
both parties, the outcome is positive for both of them, 
otherwise it slants in favor of either party. A real market, 
however, is composed of many buyers and sellers, and de- 
pending on the situation, buyers do not have the option 
of not buying; instead, they can only realistically choose
from whom they buy. This is often the case, for instance, 
for car owners who must buy gasoline, people who must 
buy groceries, bank account and credit card owners, etc. 
In this Letter, we investigate the dynamics of a trust 
game on a population of agents who face this restriction, 
and form an adaptive network of interactions [4-12]. We
identify the emergence of an effective cartel-like dynam- 
ics, where agents share low values of value for money, to 
the overall benefit of the sellers and detriment of the buy- 
ers. This cartel dynamics emerges without any explicit 
collusion among the agents, who react independently in 
order to maximize their payoff. In this dynamical phase, 
the evolution of the average value for money in the population
is marked by very large fluctuations, and a high
degree of unpredictability, with aperiodic behavior and
very broad spectral densities. These variations are a 
result of a never-ending tug-of-war between sellers and 
buyers, where buyers seek the best sellers, who in turn 
compete among themselves, while at the same time ben- 
efiting collectively from uniformly low value for money. 
This type of dynamics can be directly compared to the
time evolution of certain commodity prices such as gasoline,
which is known to fluctuate considerably between
gas stations [13-15], both in space and time, sometimes
with multiple price changes within a single day, without 
any apparent connection to the fluctuation of crude oil 
prices. Our model provides a conceptual explanation of 
the origin of such fluctuations, which does not require 
the explicit collusion among the sellers as a necessary 
element driving price changes. 

Our model is defined as follows. We consider a pop- 
ulation of N agents, where each agent has two simulta- 
neous roles: donator (e.g. a buyer) and rewarder (e.g. a 
seller). To each agent $i$ we assign a value for money variable
$w_i \in [0, 1]$. This can be interpreted, for instance, as
the quality of a sold product or service. Each agent is 
forced to choose exactly K rewarders to whom it must 
donate [16]. This forms a network of $N$ nodes, where
the adjacency matrix $A_{ij}$ describes the donators' choices.
Each agent $i$ has a donator and a rewarder payoff, $P_i^+$
and $P_i^-$, respectively, defined as

$$ P_i^+ = \sum_j A_{ij} w_j, \tag{1} $$

$$ P_i^- = (1 - w_i) k_i, \tag{2} $$

where $k_i = \sum_j A_{ji}$ is the number of donators who choose
agent $i$ (the in-degree of $i$). Eq. (1) can be interpreted
simply as the overall satisfaction a customer has with his
buying choices, and Eq. (2) as the overall profit a business
makes, which is assumed proportional to how many customers
it has, $k_i$, and to the complement of the value for
money it provides, $1 - w_i$. We assume these payoff values
correspond to continued interactions between players, in- 
stead of single isolated events (e.g. repeated games after 
an unspecified number of rounds), and are accumulated 
on a time scale which is much faster than the strategy update
dynamics. The strategies of each agent correspond
to their chosen value for money $w_i$, and their choice of
rewarders. These strategies are updated dynamically as
follows. At each time step, an agent $i$ is randomly chosen.
With probability $a$ its rewarder strategy is updated,
otherwise its donator strategy is updated. The actual
strategy updates are performed according to the follow- 
ing rules: 

1. Donator update. For the agent $i$, a random, currently
chosen rewarder $j$ is selected, so that $A_{ij} = 1$,
and compared with another rewarder $l \neq j$, with
$A_{il} = 0$, randomly chosen among the entire population.
If $w_l > w_j$, then the rewarder is replaced,
i.e. $A_{ij} \to 0$ and $A_{il} \to 1$.

2. Rewarder update. For the agent $i$, another agent
$j \neq i$ is randomly selected from the population. If
$P_j^- > P_i^-$, then its value for money is copied, i.e.
$w_i \leftarrow w_j$.
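The two update rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the results reported here: the function name `simulate`, the default parameter values, the random initial condition, the possibility of an agent choosing itself as a rewarder, and the sampling of the average value for money once every $N$ updates are all assumptions made for the example. The noise probability `r` anticipates the random resetting of $w$ used in the simulations.

```python
import random

def simulate(N=200, K=3, a=0.5, r=1e-3, steps=20000, seed=0):
    """Sketch of the donator/rewarder dynamics; requires N > K."""
    rng = random.Random(seed)
    w = [rng.random() for _ in range(N)]                # value for money of each agent
    out = [rng.sample(range(N), K) for _ in range(N)]   # out[i]: the K rewarders chosen by i
    k = [0] * N                                         # in-degrees k_i
    for chosen in out:
        for j in chosen:
            k[j] += 1
    series = []                                         # average w, sampled every N updates
    for t in range(steps):
        if rng.random() < r:                            # noise: one agent gets a random w
            w[rng.randrange(N)] = rng.random()
        i = rng.randrange(N)
        if rng.random() < a:
            # Rewarder update: copy w_j if j has a strictly larger rewarder payoff.
            j = rng.randrange(N)
            while j == i:
                j = rng.randrange(N)
            if (1 - w[j]) * k[j] > (1 - w[i]) * k[i]:
                w[i] = w[j]
        else:
            # Donator update: compare a current rewarder j with a new candidate l.
            pos = rng.randrange(K)
            j = out[i][pos]
            l = rng.randrange(N)
            while l in out[i]:                          # candidate must have A_il = 0
                l = rng.randrange(N)
            if w[l] > w[j]:
                out[i][pos] = l
                k[j] -= 1
                k[l] += 1
        if t % N == 0:
            series.append(sum(w) / N)
    return series, w, k, out
```

Calling `simulate(a=0.1)` and `simulate(a=0.9)` and comparing the returned series gives a quick qualitative look at slow versus fast rewarder updates, although populations this small are far from the system sizes discussed in the text.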

Thus the donator strategies are updated by simple comparison,
and the rewarder strategies are updated by replication.
This choice is made because it is always better for a
donator to switch to a better rewarder, whereas it is not
a priori obvious which is the best value for money a rewarder
should select: If the value $w$ of a given agent is
lowered, its payoff increases immediately, but so does
the likelihood that it will lose donators as soon as they
update their strategies. Conversely, if the value of $w$ is
increased, the rewarder's payoff decreases immediately,
but it may attract more donators in the future, which
will then cause an increased payoff. Replication based on
payoff automatically chooses the strategies which are more
successful at a given stage, and is thus the most commonly
adopted scenario in evolutionary game theory [3].
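As a purely illustrative calculation of this trade-off (the numbers are hypothetical and are not taken from the simulations), consider a rewarder with $k = 10$ donators and the rewarder payoff $P^- = (1 - w)k$ of Eq. (2):

```latex
\begin{align*}
w = 0.8,\; k = 10 &: \quad P^- = (1 - 0.8) \times 10 = 2, \\
w = 0.5,\; k = 10 &: \quad P^- = (1 - 0.5) \times 10 = 5, \\
w = 0.5,\; k = 3  &: \quad P^- = (1 - 0.5) \times 3 = 1.5 .
\end{align*}
```

Lowering $w$ from 0.8 to 0.5 raises the payoff from 2 to 5 at once, but the gain is lost if the rewarder is later left with fewer than four donators, since $(1 - 0.5)k < 2$ for $k < 4$.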

We investigate the dynamics of this model by simulating
a population of $N = 10^6$ agents, as well as obtaining
some properties analytically in the limit $N \to \infty$. In
order to avoid absorbing states where a single value for
money is fixated on the entire population, we introduce
a small noise probability $r = 10^{-6}$ that at each time
step a randomly chosen agent acquires a random value of
$w \in [0, 1]$. The dynamics will depend strongly on the
parameter $a$, which controls the relative speed with which
rewarders update their strategies, when compared to donators.
If this value is too low, the donators will react
fast enough to changes in available value for money,
selecting rewarders with higher $w$ values, and only these agents
will have a larger rewarder payoff, and thus the dynamics
will settle on a stable fixed point where the entire
population has the same value of $w$, which tends asymptotically
to one, as can be seen in Fig. 1(a). In this situation
all rewarders will on average receive the same number of
donators, which will be distributed according to a Poisson
distribution. This is the ideal scenario for donators, but the worst
possible for rewarders, since their average payoff values
reach their maximum and minimum values, respectively.
However, as the remaining panels of Fig. 1 show, this situation
changes as $a$ is increased. For values of $a > a_c$,
where $a_c$ is a critical value depending on $K$, the average
value for money $\langle w \rangle$ fluctuates around values smaller
than one, since the rewarders are quick enough to copy
low values of $w$, so that there are few higher values of $w$
left in the population, before the donators have a chance
to react. The values of $w$ remain low since there are no
other options for the donators to choose from. This is an
emergent cartel-like dynamical phase, since all rewarders
have settled on a range of $w$ values which is beneficial to
the entire population of rewarders, and detrimental to the
population of donators, which are left with a restricted
choice. However, the values of $\langle w \rangle$ are not quite stable and
fluctuate tremendously, due to the influence of the donators
and the ongoing competition between rewarders.
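The Poisson in-degree distribution mentioned above for the uniform fixed point can be checked with a short numerical experiment. The sketch below assumes that, when all rewarders are equivalent, each donator's $K$ choices are effectively uniform random picks from the population; the function name `in_degree_stats` and the parameter values are illustrative. For a Poisson distribution with average $K$, the mean and the variance of the in-degree should both be close to $K$.

```python
import random
from collections import Counter

def in_degree_stats(N=20000, K=5, seed=0):
    """In-degree mean, variance and histogram when each agent picks K rewarders at random."""
    rng = random.Random(seed)
    k = [0] * N
    for _ in range(N):
        for j in rng.sample(range(N), K):   # K distinct rewarders per donator
            k[j] += 1
    mean = sum(k) / N
    var = sum((x - mean) ** 2 for x in k) / N
    return mean, var, Counter(k)

mean, var, hist = in_degree_stats()
print(mean, var)   # both should be close to K for large N
```

The mean is exactly $K$ by construction, since every donator places $K$ choices; the variance is the quantity that actually tests the Poisson form.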

Fig. 2(a) shows the phase diagram for several values of $K$,
with the emergence of the cartel phase at a critical value $a_c$,
above which the fixed point $\langle w \rangle = 1$ ceases
to be stable. The stability of this fixed point can be
assessed by a linear stability analysis: If this fixed point
is perturbed by the inclusion of a small fraction of agents
with a lower value of $w < 1$, the time evolution of the
probability density $p(k, w)$ of agents with this $w$ value
and in-degree $k$ is given by



$$ \frac{d}{dt} p(k,w) = a P_0(k) (1 - \delta_{k,0}) \sum_{k'} p(k', w) - \dots \tag{3} $$

where $P_0(k)$ is a Poisson distribution with average $K$,
and terms of order $O(p(k, w)^2)$ were neglected (see Eq. (4)
below for the full master equation). The first term of
Eq. (3) corresponds simply to the probability of rewarders
adopting the invading strategy, whereas the second term
accounts for the probability of agents with the invading
strategy losing donators. Eq. (3) is a linear system, and
thus can be written in the form $\dot{x} = M x$, where $x_i$ are
the individual $p(k, w)$ variables, and $M$ is a matrix
corresponding to the right-hand side of Eq. (3). If the value
of $a$ is large enough so that the largest eigenvalue of $M$
becomes larger than one, $\lambda > 1$, the fixed point ceases to
be stable. By numerically computing $\lambda$, one can find the
value of $a = a_c$ for which $\lambda = 1$. These values predict
exactly the transition point, as Fig. 2(b) shows.

Exactly at the critical value $a = a_c$, the oscillations
of $\langle w \rangle$ show typical critical behavior, with sharp jumps
from the average value close to one, corresponding to repeated
successful invasions of low value for money, which
disappear after a relatively short time. Interestingly, the
variance of the average value over time, $\sigma_{\langle w \rangle}$, is largest
not exactly at the critical point, but at values $a > a_c$
close to it, as Fig. 3(a) shows. Exactly for the values of
$a$ for which $\sigma_{\langle w \rangle}$ is maximum, one obtains a universal
behavior for the in-degree distribution of agents (accumulated
over the entire history), which exhibits a power-law
tail of the form $P(k) \sim k^{-3}$ (see Fig. 3(b)). For larger
values of $a$ the fluctuations diminish but remain significant.
Indeed the $P(k)$ distributions are broad for almost
the entire parameter space, as the remaining panels of
Fig. 3 show.

FIG. 1. Time evolution of the average value for money $\langle w \rangle$,
in a population of $N = 10^6$ agents with different values of $K$
and $a$. The lower right panel (d) shows a zoomed region of
the lower left panel (c).

FIG. 2. (a) Phase diagram showing the average value for
money $\langle w \rangle$ over time, as a function of $a$, in a population of
$N = 10^6$ agents with different values of $K$. The error bars
indicate the variance of the time series (not of the average).
(b) The same curves as in (a) scaled according to the critical
value $a_c$, calculated from Eq. (3).

A visual analysis of the temporal evolution of the values
of $\langle w \rangle$ cannot reveal the precise characteristics of
the fluctuations, e.g. whether they are aperiodic or quasi-periodic.
Fig. 4 shows the spectral density of the
time series, revealing very broad spectra, compatible with
aperiodic behavior. For values of $a$ close to 1, a $1/f^{3/2}$
spectrum is clearly identified. For lower
values of $a$, the spectrum is divided roughly into low-
and high-frequency regions, with stronger fluctuations at
lower frequencies. The lower-frequency spectrum is
significantly broad, and is compatible with a power-law decay
in $1/f$, with exponents in the range $[1/2, 3/2]$.
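A spectral density of this kind can be estimated from a sampled time series with a simple periodogram. The sketch below is one possible estimator, not necessarily the one used for Fig. 4; the function names `power_spectrum` and `fit_exponent`, the sampling interval `dt`, and the least-squares fit on a log-log scale are assumptions made for illustration.

```python
import numpy as np

def power_spectrum(x, dt=1.0):
    """Periodogram of a real time series x sampled every dt (zero frequency dropped)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                          # remove the mean before transforming
    n = len(x)
    spec = np.abs(np.fft.rfft(x)) ** 2 * dt / n
    freq = np.fft.rfftfreq(n, d=dt)
    return freq[1:], spec[1:]

def fit_exponent(freq, spec):
    """Exponent of a power-law decay spec ~ 1/freq**exponent, from a log-log linear fit."""
    slope, _ = np.polyfit(np.log(freq), np.log(spec), 1)
    return -slope
```

In practice the raw periodogram of a single noisy series is itself very noisy, so averaging over several runs or over logarithmic frequency bins before fitting the exponent is usually advisable, and the fit should be restricted to the frequency range where a power law is actually expected.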

FIG. 3. (a) Variance of the average value for money $\langle w \rangle$ over
time, as a function of $a$, for different $K$. (b) Rescaled in-degree
distribution, accumulated over the entire history, for
the values of $a$ for which $\sigma_{\langle w \rangle}$ is maximum. Lower panels:
In-degree distributions for $K = 5$ (c) and $K = 10$ (d), and different
values of $a$.

FIG. 4. Power spectral density for the time series of $\langle w \rangle$, and
different values of $K$ and $a$. The dotted and dashed lines show
$1/f^{3/2}$ and $1/f^{1/2}$ curves, respectively.

The exact dynamics which give rise to the observed
fluctuations can be explored more closely by specifying
the full master equation which describes the behavior of
the system in the limit $N \to \infty$. For practical reasons,
we assume now that the values of $w_i$ must be chosen from
a discrete set of $N_w$ elements distributed uniformly in the
$[0, 1]$ range. Taking the limit $N_w \to \infty$, one recovers the
exact same model as before. One can describe the time
evolution of the probability $P(k, w)$ of observing an agent
with value for money $w$ and in-degree $k$ as



$$ \frac{d}{dt} P(k, w) = a\, \gamma(k, w) + (1 - a)\, \ell(k, w), \tag{4} $$



where $\gamma(k, w)$ and $\ell(k, w)$ describe the rewarder replication
and donator comparison dynamics, respectively.
The term $\gamma(k, w)$ is defined as



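$$ \gamma(k, w) = \sum_{w' \neq w} \left[ P(k, w')\, P_k\!\left(k\, \frac{1 - w'}{1 - w},\, w\right) - P(k, w)\, P_k\!\left(k\, \frac{1 - w}{1 - w'},\, w'\right) \right], \tag{5} $$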



where $P_k(k, w) = \sum_{k' \ge k} P(k', w)$. The first term in Eq. (5)
corresponds to the probability of a randomly selected agent
with a lower or equal payoff $k'(1 - w')$ selecting $w$ as the
new strategy, and the second term to the probability of an
agent with strategy $w$ finding another agent with higher
or equal payoff and a different value $w'$. The term $\ell(k, w)$
describes the change in $P(k, w)$ which is due to donators
selecting different rewarders, and is given by



$$ \ell(k, w) = \frac{k + 1}{\langle k \rangle} P(k + 1, w) \left[ P_w(w) - P(k, w) \right] - \frac{k}{\langle k \rangle} P(k, w) \left[ P_w(w) - P(k - 1, w) \right] + P(k - 1, w) \left[ P^*(w) - \frac{k}{\langle k \rangle} P(k, w) \right] - P(k, w) \left[ P^*(w) - \frac{k + 1}{\langle k \rangle} P(k + 1, w) \right], \tag{6} $$

where $P_w(w) = \sum_{k', w' \ge w} P(k', w')$ and
$P^*(w) = \sum_{k', w' \le w} P(k', w')\, k' / \langle k \rangle$. The first two terms in Eq. (6)
correspond to the probability of a rewarder with strategy
$w$ losing a donator due to a comparison with a random
rewarder with a higher or equal value $w'$. The two
remaining terms describe the converse probability of a
rewarder receiving a donator from another rewarder with
value $w' \le w$. The time evolution of $P(k, w)$ for $a > a_c$
is shown in Fig. 5, starting from a random configuration
where all values of $w$ are equally probable, and the
in-degree distribution is a Poisson distribution for all values of $w$.
Initially, the mass of the distribution shifts to lower values of
$w$ as agents adopt a value for money with a larger payoff.
Simultaneously, the upper left portion of the distribution
increases in mass, since the rewarders with larger $w$
receive more donators. Eventually the payoff of the
rewarders with high $w$ and $k$ will be large enough to drive
the entire distribution upwards. At this point, all
rewarders will receive approximately the same number of
donators, and the system will become susceptible to an
invasion of low value for money. Due to the same dynamics
as before, the new front of low value for money
will move upwards in the $w$ axis, prompting the eventual
appearance of yet another front, and so on. Although
this corresponds to cycles of average value for money, the
whole dynamics is aperiodic, and it is not easy to predict
when the next front will come, and how it will interact
with the preceding ones. This dynamics of succeeding
fronts proceeds indefinitely, and the system never settles
on a fixed point.
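The donator term of the master equation can be evaluated numerically on a truncated grid. The sketch below assumes that $\ell(k, w)$ is built from two fluxes: a loss flux $\frac{k}{\langle k \rangle} P(k, w) [P_w(w) - P(k - 1, w)]$ from $(k, w)$ to $(k - 1, w)$, and a gain flux $P(k, w) [P^*(w) - \frac{k + 1}{\langle k \rangle} P(k + 1, w)]$ from $(k, w)$ to $(k + 1, w)$, with $P_w$ and $P^*$ summed over $w' \ge w$ and $w' \le w$, respectively. This form is an assumption of the example, as are the array layout `P[k, w]`, the cutoff at a maximum in-degree, and the function name `donator_term`.

```python
import numpy as np

def donator_term(P):
    """Donator contribution l(k, w) for P[k, w]; rows are k = 0..kmax, columns are w bins
    in increasing order. Assumes P sums to one and P[kmax] is negligible."""
    kmax = P.shape[0] - 1
    k = np.arange(kmax + 1)[:, None]
    mean_k = (k * P).sum()
    P_w = P.sum(axis=0)[::-1].cumsum()[::-1]          # sum over k' and w' >= w
    P_star = ((k * P).sum(axis=0) / mean_k).cumsum()  # sum over k' and w' <= w, weight k'/<k>
    zero = np.zeros((1, P.shape[1]))
    P_up = np.vstack([P[1:], zero])                   # P(k+1, w)
    P_dn = np.vstack([zero, P[:-1]])                  # P(k-1, w)
    lose = k / mean_k * P * (P_w - P_dn)              # flux (k, w) -> (k-1, w)
    gain = P * (P_star - (k + 1) / mean_k * P_up)     # flux (k, w) -> (k+1, w)
    gain[-1] = 0.0                                    # truncation: nothing leaves k = kmax upwards
    lose_from_above = np.vstack([lose[1:], zero])     # loss flux arriving from k+1
    gain_from_below = np.vstack([zero, gain[:-1]])    # gain flux arriving from k-1
    return lose_from_above - lose + gain_from_below - gain
```

Two sanity checks follow from the dynamics itself: donator moves never change any $w$, so each column of the result should sum to zero, and every move transfers exactly one donator, so the total in-degree should be conserved. For an initial condition with Poisson in-degrees and uniformly distributed $w$, the largest $w$ bin should gain donators and the smallest should lose them.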

In conclusion, we have developed a minimal model of a
trust game played on a population of agents, which displays
an emergent cartel-like behavior, with large fluctuations
of value for money. As mentioned previously, this
model provides a qualitative explanation for the
fluctuations of certain commodity prices, such as gasoline.
As many empirical studies have shown [13], the price of
gasoline fluctuates between gas stations. The average
price for a given city often exhibits daily variations, and
sometimes fluctuates within the same day. The average
price often rises very fast, and decays more slowly. This
type of oscillation is called Edgeworth price cycles [14],
and is predicted by simple models involving two companies,
which change their strategy at each round, reacting
to the strategy played by the other company at
the previous round [14, 15]. This model, however, is not
applicable in situations involving many companies. Our
model not only assumes that there are many sellers, but
also incorporates the behavior of the buyers explicitly.
The resulting oscillations which we observe are a result
of the competition in the entire market, not the steady
state behavior of very few companies which observe each
other directly. Furthermore, it sheds light on the question
of market regulation. It is often debated whether the
observed fluctuations are a result of collusion among the
gas companies, who attempt to increase the gas prices
in unison [13]. Although this is certainly a possibility,
our model shows that an explicit coordinated behavior
among sellers is not an indispensable requirement for a
cartel-like behavior.

FIG. 5. Temporal evolution of $P(k, w)$, [...] integrating Eq. (4).

ilar, with the only noticeable deviation being slightly
different critical values $a_c$ marking the dynamical phase